Calibration

The Calibration class provides a way to adjust weights of observations in a dataset to match specified target values. This is commonly used in survey research and policy modeling for rebalancing datasets to better represent desired population characteristics.

The calibration process uses an optimization algorithm to find weights that minimize the distance from the original weights while achieving the target constraints.
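As a rough illustration of how an optimizer can move weights toward a set of targets, the sketch below runs plain gradient descent on log-weights to reduce the mean squared relative error between weighted sums and targets. It is an assumption-laden toy, not the library's implementation: the function name `reweight_sketch`, the `contributions` array and the loss are all hypothetical, there is no explicit penalty on distance from the original weights (the weights simply start there), and the weights must be strictly positive.

```python
import numpy as np


def reweight_sketch(contributions, weights, targets, epochs=500, learning_rate=0.5):
    """Illustrative gradient descent on log-weights (not the library's code)."""
    # Optimizing log-weights keeps every weight positive.
    log_w = np.log(np.asarray(weights, dtype=float))
    for _ in range(epochs):
        w = np.exp(log_w)
        estimates = w @ contributions  # weighted sum for each target
        rel_error = (estimates - targets) / targets
        # Gradient of mean(rel_error ** 2) with respect to the log-weights
        grad = (2.0 / len(targets)) * (contributions @ (rel_error / targets)) * w
        log_w -= learning_rate * grad
    return np.exp(log_w)


# Toy data: four records, two targets; each record contributes to one target.
contributions = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
initial_weights = np.ones(4)
toy_targets = np.array([3.0, 2.5])

new_weights = reweight_sketch(contributions, initial_weights, toy_targets)
print(new_weights @ contributions)
```

On this toy problem the weighted sums converge to the targets; real calibration problems with many overlapping targets generally need a more careful optimizer and stopping rule.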

Basic usage

Parameters

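__init__(loss_matrix, weights, targets)

  • loss_matrix (pd.DataFrame): The dataset to be calibrated, with one row per observation and one column per target. Each value is that observation's contribution to the corresponding target, so the matrix should cover every variable you want to calibrate on.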

  • weights (np.ndarray): Initial weights for each observation in the dataset. Typically starts as an array of ones for equal weighting.

  • targets (np.ndarray): Target values that the calibration process should achieve. These correspond to the desired weighted sums.
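To make "weighted sums" concrete, the short sketch below multiplies each record's contribution by its weight and sums per column; those totals are what the targets are compared against. The frame name `contributions`, its column names and the numbers are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical per-record contributions to two targets
contributions = pd.DataFrame({
    "income_young": [30000.0, 0.0, 45000.0],
    "income_old": [0.0, 52000.0, 0.0],
})
weights = np.array([1.0, 2.0, 1.5])

# Weighted sum per target: scale each row by its weight, then sum the columns
estimates = contributions.mul(weights, axis=0).sum()
print(estimates)
```

Here `income_young` totals 97500.0 (30000 × 1.0 + 45000 × 1.5) and `income_old` totals 104000.0 (52000 × 2.0), so a `targets` array for this frame would hold one desired value for each of those two columns, in the same order.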

To calibrate, initialize the Calibration class with the parameters above and then call its calibrate() method, which performs the actual calibration using the reweight function. This method:

  • Adjusts the weights to better match the target values

  • May subsample the data for efficiency

  • Updates both self.weights and self.loss_matrix with the calibrated results

Example

Below is a complete example showing how to calibrate a dataset to match income targets for specific age groups:

from microcalibrate.calibration import Calibration
import logging
import numpy as np
import pandas as pd
import plotly.graph_objs as go
from plotly.subplots import make_subplots

logging.basicConfig(
    level=logging.INFO,
)

# Create a sample dataset with age and income data
random_generator = np.random.default_rng(0)
data = pd.DataFrame({
    "age": np.append(random_generator.integers(18, 70, size=120), 71), 
    "income": random_generator.normal(40000, 10000, size=121),
})

# Set initial weights (all ones in this example)
weights = np.ones(len(data))

# Build the targets matrix: each record's income contribution to the 20-30, 40-50 and 71 age groups (or use an existing matrix)
targets_matrix = pd.DataFrame({
    "income_aged_20_30": ((data["age"] >= 20) & (data["age"] <= 30)).astype(float) * data["income"],
    "income_aged_40_50": ((data["age"] >= 40) & (data["age"] <= 50)).astype(float) * data["income"],
    "income_aged_71" : (data["age"] == 71).astype(float) * data["income"],
})

# Targets relative to the sums under the original weights: 1000x for ages 20-30, 15% higher for the other two groups
targets = np.array([
    (targets_matrix["income_aged_20_30"] * weights * 1000).sum(), 
    (targets_matrix["income_aged_40_50"] * weights * 1.15).sum(), 
    (targets_matrix["income_aged_71"] * weights * 1.15).sum()
])

print(f"Original weights: {weights}")
print(f"Original targets: {targets}")
Original weights: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
 1.]
Original targets: [7.37032429e+08 9.76779350e+05 4.36479914e+04]
# Initialize the Calibration object
calibrator = Calibration(
    loss_matrix=targets_matrix,
    weights=weights, 
    targets=targets,
    noise_level=0.05,
    epochs=528,
    learning_rate=0.01,
    dropout_rate=0,
    subsample_every=0,
)

# Perform the calibration
performance_df = calibrator.calibrate()

print(f"Original dataset size: {len(targets_matrix)}")
print(f"Calibrated dataset size: {len(calibrator.loss_matrix)}")
print(f"Number of calibrated weights: {len(calibrator.weights)}")
INFO:microcalibrate.calibration:Performing basic target assessment...
WARNING:microcalibrate.calibration:Target income_aged_20_30 (7.37e+08) differs from initial estimate (7.37e+05) by 3.00 orders of magnitude.
WARNING:microcalibrate.calibration:Target income_aged_71 is supported by only 0.83% of records in the loss matrix. This may make calibration unstable or ineffective.
INFO:microcalibrate.reweight:Starting calibration process for targets ['income_aged_20_30' 'income_aged_40_50' 'income_aged_71']: [7.37032429e+08 9.76779350e+05 4.36479914e+04]
INFO:microcalibrate.reweight:Original weights - mean: 1.0000, std: 0.0000
INFO:microcalibrate.reweight:Initial weights after noise - mean: 1.0242, std: 0.0141
Reweighting progress:   0%|          | 0/528 [00:00<?, ?epoch/s]
Reweighting progress:   0%|          | 0/528 [00:00<?, ?epoch/s, loss=0.34, count_observations=121, weights_mean=1.02, weights_std=0.0141, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 33.33% 
Reweighting progress:   0%|          | 0/528 [00:00<?, ?epoch/s, loss=0.333, count_observations=121, weights_mean=1.06, weights_std=0.0505, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 66.67% 
INFO:microcalibrate.reweight:Epoch   10: Loss = 0.332754, Change = 0.006961 (improving)
Reweighting progress:   0%|          | 0/528 [00:00<?, ?epoch/s, loss=0.333, count_observations=121, weights_mean=1.09, weights_std=0.092, weights_min=1] 
INFO:microcalibrate.reweight:Within 10% from targets: 66.67% 
INFO:microcalibrate.reweight:Epoch   20: Loss = 0.332984, Change = -0.000230 (worsening)
Reweighting progress:   0%|          | 0/528 [00:00<?, ?epoch/s, loss=0.332, count_observations=121, weights_mean=1.1, weights_std=0.131, weights_min=1] 
INFO:microcalibrate.reweight:Within 10% from targets: 66.67% 
INFO:microcalibrate.reweight:Epoch   30: Loss = 0.332441, Change = 0.000543 (improving)
Reweighting progress:   0%|          | 0/528 [00:00<?, ?epoch/s, loss=0.332, count_observations=121, weights_mean=1.12, weights_std=0.184, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 66.67% 
INFO:microcalibrate.reweight:Epoch   40: Loss = 0.332364, Change = 0.000077 (improving)
Reweighting progress:   0%|          | 0/528 [00:00<?, ?epoch/s, loss=0.332, count_observations=121, weights_mean=1.15, weights_std=0.249, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 66.67% 
INFO:microcalibrate.reweight:Epoch   50: Loss = 0.332184, Change = 0.000180 (improving)
Reweighting progress:   0%|          | 0/528 [00:00<?, ?epoch/s, loss=0.332, count_observations=121, weights_mean=1.19, weights_std=0.326, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 66.67% 
INFO:microcalibrate.reweight:Epoch   60: Loss = 0.332040, Change = 0.000144 (improving)
Reweighting progress:   0%|          | 0/528 [00:00<?, ?epoch/s, loss=0.332, count_observations=121, weights_mean=1.23, weights_std=0.417, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 66.67% 
INFO:microcalibrate.reweight:Epoch   70: Loss = 0.331862, Change = 0.000178 (improving)
Reweighting progress:   0%|          | 0/528 [00:00<?, ?epoch/s, loss=0.332, count_observations=121, weights_mean=1.27, weights_std=0.527, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 66.67% 
INFO:microcalibrate.reweight:Epoch   80: Loss = 0.331657, Change = 0.000206 (improving)
Reweighting progress:   0%|          | 0/528 [00:00<?, ?epoch/s, loss=0.331, count_observations=121, weights_mean=1.33, weights_std=0.658, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 66.67% 
INFO:microcalibrate.reweight:Epoch   90: Loss = 0.331411, Change = 0.000246 (improving)
Reweighting progress:   0%|          | 0/528 [00:00<?, ?epoch/s, loss=0.331, count_observations=121, weights_mean=1.39, weights_std=0.817, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 66.67% 
INFO:microcalibrate.reweight:Epoch  100: Loss = 0.331115, Change = 0.000296 (improving)
Reweighting progress:   0%|          | 0/528 [00:00<?, ?epoch/s, loss=0.331, count_observations=121, weights_mean=1.47, weights_std=1.01, weights_min=1] 
INFO:microcalibrate.reweight:Within 10% from targets: 66.67% 
INFO:microcalibrate.reweight:Epoch  110: Loss = 0.330755, Change = 0.000360 (improving)
Reweighting progress:   0%|          | 0/528 [00:00<?, ?epoch/s, loss=0.33, count_observations=121, weights_mean=1.57, weights_std=1.25, weights_min=1] 
INFO:microcalibrate.reweight:Within 10% from targets: 66.67% 
INFO:microcalibrate.reweight:Epoch  120: Loss = 0.330315, Change = 0.000440 (improving)
Reweighting progress:  23%|██▎       | 122/528 [00:00<00:00, 1218.15epoch/s, loss=0.33, count_observations=121, weights_mean=1.57, weights_std=1.25, weights_min=1]
Reweighting progress:  23%|██▎       | 122/528 [00:00<00:00, 1218.15epoch/s, loss=0.33, count_observations=121, weights_mean=1.69, weights_std=1.54, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 66.67% 
INFO:microcalibrate.reweight:Epoch  130: Loss = 0.329773, Change = 0.000542 (improving)
Reweighting progress:  23%|██▎       | 122/528 [00:00<00:00, 1218.15epoch/s, loss=0.329, count_observations=121, weights_mean=1.84, weights_std=1.9, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 66.67% 
INFO:microcalibrate.reweight:Epoch  140: Loss = 0.329101, Change = 0.000672 (improving)
Reweighting progress:  23%|██▎       | 122/528 [00:00<00:00, 1218.15epoch/s, loss=0.328, count_observations=121, weights_mean=2.03, weights_std=2.35, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 66.67% 
INFO:microcalibrate.reweight:Epoch  150: Loss = 0.328261, Change = 0.000840 (improving)
Reweighting progress:  23%|██▎       | 122/528 [00:00<00:00, 1218.15epoch/s, loss=0.327, count_observations=121, weights_mean=2.27, weights_std=2.92, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 66.67% 
INFO:microcalibrate.reweight:Epoch  160: Loss = 0.327205, Change = 0.001056 (improving)
Reweighting progress:  23%|██▎       | 122/528 [00:00<00:00, 1218.15epoch/s, loss=0.326, count_observations=121, weights_mean=2.57, weights_std=3.65, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 66.67% 
INFO:microcalibrate.reweight:Epoch  170: Loss = 0.325868, Change = 0.001337 (improving)
Reweighting progress:  23%|██▎       | 122/528 [00:00<00:00, 1218.15epoch/s, loss=0.324, count_observations=121, weights_mean=2.96, weights_std=4.57, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 66.67% 
INFO:microcalibrate.reweight:Epoch  180: Loss = 0.324164, Change = 0.001704 (improving)
Reweighting progress:  23%|██▎       | 122/528 [00:00<00:00, 1218.15epoch/s, loss=0.322, count_observations=121, weights_mean=3.45, weights_std=5.76, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 66.67% 
INFO:microcalibrate.reweight:Epoch  190: Loss = 0.321979, Change = 0.002185 (improving)
Reweighting progress:  23%|██▎       | 122/528 [00:00<00:00, 1218.15epoch/s, loss=0.319, count_observations=121, weights_mean=4.09, weights_std=7.3, weights_min=1] 
INFO:microcalibrate.reweight:Within 10% from targets: 66.67% 
INFO:microcalibrate.reweight:Epoch  200: Loss = 0.319161, Change = 0.002818 (improving)
Reweighting progress:  23%|██▎       | 122/528 [00:00<00:00, 1218.15epoch/s, loss=0.316, count_observations=121, weights_mean=4.93, weights_std=9.31, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 66.67% 
INFO:microcalibrate.reweight:Epoch  210: Loss = 0.315507, Change = 0.003653 (improving)
Reweighting progress:  23%|██▎       | 122/528 [00:00<00:00, 1218.15epoch/s, loss=0.311, count_observations=121, weights_mean=6.02, weights_std=11.9, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 66.67% 
INFO:microcalibrate.reweight:Epoch  220: Loss = 0.310750, Change = 0.004757 (improving)
Reweighting progress:  23%|██▎       | 122/528 [00:00<00:00, 1218.15epoch/s, loss=0.305, count_observations=121, weights_mean=7.47, weights_std=15.4, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 66.67% 
INFO:microcalibrate.reweight:Epoch  230: Loss = 0.304535, Change = 0.006216 (improving)
Reweighting progress:  23%|██▎       | 122/528 [00:00<00:00, 1218.15epoch/s, loss=0.296, count_observations=121, weights_mean=9.38, weights_std=20, weights_min=1]  
INFO:microcalibrate.reweight:Within 10% from targets: 66.67% 
INFO:microcalibrate.reweight:Epoch  240: Loss = 0.296395, Change = 0.008140 (improving)
Reweighting progress:  46%|████▋     | 245/528 [00:00<00:00, 1224.88epoch/s, loss=0.296, count_observations=121, weights_mean=9.38, weights_std=20, weights_min=1]
Reweighting progress:  46%|████▋     | 245/528 [00:00<00:00, 1224.88epoch/s, loss=0.286, count_observations=121, weights_mean=11.9, weights_std=26.1, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 66.67% 
INFO:microcalibrate.reweight:Epoch  250: Loss = 0.285729, Change = 0.010665 (improving)
Reweighting progress:  46%|████▋     | 245/528 [00:00<00:00, 1224.88epoch/s, loss=0.272, count_observations=121, weights_mean=15.3, weights_std=34.3, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 66.67% 
INFO:microcalibrate.reweight:Epoch  260: Loss = 0.271781, Change = 0.013948 (improving)
Reweighting progress:  46%|████▋     | 245/528 [00:00<00:00, 1224.88epoch/s, loss=0.254, count_observations=121, weights_mean=19.9, weights_std=45.3, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 66.67% 
INFO:microcalibrate.reweight:Epoch  270: Loss = 0.253636, Change = 0.018146 (improving)
Reweighting progress:  46%|████▋     | 245/528 [00:00<00:00, 1224.88epoch/s, loss=0.23, count_observations=121, weights_mean=26, weights_std=60, weights_min=1]     
INFO:microcalibrate.reweight:Within 10% from targets: 66.67% 
INFO:microcalibrate.reweight:Epoch  280: Loss = 0.230272, Change = 0.023363 (improving)
Reweighting progress:  46%|████▋     | 245/528 [00:00<00:00, 1224.88epoch/s, loss=0.201, count_observations=121, weights_mean=34.2, weights_std=79.7, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 66.67% 
INFO:microcalibrate.reweight:Epoch  290: Loss = 0.200728, Change = 0.029544 (improving)
Reweighting progress:  46%|████▋     | 245/528 [00:00<00:00, 1224.88epoch/s, loss=0.164, count_observations=121, weights_mean=45.2, weights_std=106, weights_min=1] 
INFO:microcalibrate.reweight:Within 10% from targets: 66.67% 
INFO:microcalibrate.reweight:Epoch  300: Loss = 0.164481, Change = 0.036247 (improving)
Reweighting progress:  46%|████▋     | 245/528 [00:00<00:00, 1224.88epoch/s, loss=0.122, count_observations=121, weights_mean=59.6, weights_std=141, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 66.67% 
INFO:microcalibrate.reweight:Epoch  310: Loss = 0.122212, Change = 0.042269 (improving)
Reweighting progress:  46%|████▋     | 245/528 [00:00<00:00, 1224.88epoch/s, loss=0.0771, count_observations=121, weights_mean=78.1, weights_std=185, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 66.67% 
INFO:microcalibrate.reweight:Epoch  320: Loss = 0.077057, Change = 0.045155 (improving)
Reweighting progress:  46%|████▋     | 245/528 [00:00<00:00, 1224.88epoch/s, loss=0.036, count_observations=121, weights_mean=101, weights_std=240, weights_min=1]  
INFO:microcalibrate.reweight:Within 10% from targets: 66.67% 
INFO:microcalibrate.reweight:Epoch  330: Loss = 0.035951, Change = 0.041106 (improving)
Reweighting progress:  46%|████▋     | 245/528 [00:00<00:00, 1224.88epoch/s, loss=0.00874, count_observations=121, weights_mean=126, weights_std=299, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 66.67% 
INFO:microcalibrate.reweight:Epoch  340: Loss = 0.008740, Change = 0.027211 (improving)
Reweighting progress:  46%|████▋     | 245/528 [00:00<00:00, 1224.88epoch/s, loss=0.000146, count_observations=121, weights_mean=147, weights_std=350, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 100.00% 
INFO:microcalibrate.reweight:Epoch  350: Loss = 0.000146, Change = 0.008595 (improving)
Reweighting progress:  46%|████▋     | 245/528 [00:00<00:00, 1224.88epoch/s, loss=0.000684, count_observations=121, weights_mean=156, weights_std=373, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 100.00% 
INFO:microcalibrate.reweight:Epoch  360: Loss = 0.000684, Change = -0.000539 (worsening)
Reweighting progress:  70%|██████▉   | 368/528 [00:00<00:00, 1217.31epoch/s, loss=0.000684, count_observations=121, weights_mean=156, weights_std=373, weights_min=1]
Reweighting progress:  70%|██████▉   | 368/528 [00:00<00:00, 1217.31epoch/s, loss=0.000525, count_observations=121, weights_mean=156, weights_std=371, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 100.00% 
INFO:microcalibrate.reweight:Epoch  370: Loss = 0.000525, Change = 0.000159 (improving)
Reweighting progress:  70%|██████▉   | 368/528 [00:00<00:00, 1217.31epoch/s, loss=4.79e-5, count_observations=121, weights_mean=151, weights_std=361, weights_min=1] 
INFO:microcalibrate.reweight:Within 10% from targets: 100.00% 
INFO:microcalibrate.reweight:Epoch  380: Loss = 0.000048, Change = 0.000477 (improving)
Reweighting progress:  70%|██████▉   | 368/528 [00:00<00:00, 1217.31epoch/s, loss=9.02e-6, count_observations=121, weights_mean=149, weights_std=355, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 100.00% 
INFO:microcalibrate.reweight:Epoch  390: Loss = 0.000009, Change = 0.000039 (improving)
Reweighting progress:  70%|██████▉   | 368/528 [00:00<00:00, 1217.31epoch/s, loss=2.12e-5, count_observations=121, weights_mean=148, weights_std=354, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 100.00% 
INFO:microcalibrate.reweight:Epoch  400: Loss = 0.000021, Change = -0.000012 (worsening)
Reweighting progress:  70%|██████▉   | 368/528 [00:00<00:00, 1217.31epoch/s, loss=5.14e-6, count_observations=121, weights_mean=149, weights_std=356, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 100.00% 
INFO:microcalibrate.reweight:Epoch  410: Loss = 0.000005, Change = 0.000016 (improving)
Reweighting progress:  70%|██████▉   | 368/528 [00:00<00:00, 1217.31epoch/s, loss=4.35e-10, count_observations=121, weights_mean=150, weights_std=357, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 100.00% 
INFO:microcalibrate.reweight:Epoch  420: Loss = 0.000000, Change = 0.000005 (improving)
Reweighting progress:  70%|██████▉   | 368/528 [00:00<00:00, 1217.31epoch/s, loss=6.66e-7, count_observations=121, weights_mean=150, weights_std=358, weights_min=1] 
INFO:microcalibrate.reweight:Within 10% from targets: 100.00% 
INFO:microcalibrate.reweight:Epoch  430: Loss = 0.000001, Change = -0.000001 (worsening)
Reweighting progress:  70%|██████▉   | 368/528 [00:00<00:00, 1217.31epoch/s, loss=3.06e-7, count_observations=121, weights_mean=150, weights_std=357, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 100.00% 
INFO:microcalibrate.reweight:Epoch  440: Loss = 0.000000, Change = 0.000000 (improving)
Reweighting progress:  70%|██████▉   | 368/528 [00:00<00:00, 1217.31epoch/s, loss=7.71e-9, count_observations=121, weights_mean=150, weights_std=357, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 100.00% 
INFO:microcalibrate.reweight:Epoch  450: Loss = 0.000000, Change = 0.000000 (improving)
Reweighting progress:  70%|██████▉   | 368/528 [00:00<00:00, 1217.31epoch/s, loss=1.85e-8, count_observations=121, weights_mean=150, weights_std=357, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 100.00% 
INFO:microcalibrate.reweight:Epoch  460: Loss = 0.000000, Change = -0.000000 (worsening)
Reweighting progress:  70%|██████▉   | 368/528 [00:00<00:00, 1217.31epoch/s, loss=1.47e-8, count_observations=121, weights_mean=150, weights_std=357, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 100.00% 
INFO:microcalibrate.reweight:Epoch  470: Loss = 0.000000, Change = 0.000000 (improving)
Reweighting progress:  70%|██████▉   | 368/528 [00:00<00:00, 1217.31epoch/s, loss=9.88e-10, count_observations=121, weights_mean=150, weights_std=357, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 100.00% 
INFO:microcalibrate.reweight:Epoch  480: Loss = 0.000000, Change = 0.000000 (improving)
Reweighting progress:  93%|█████████▎| 490/528 [00:00<00:00, 1213.07epoch/s, loss=9.88e-10, count_observations=121, weights_mean=150, weights_std=357, weights_min=1]
Reweighting progress:  93%|█████████▎| 490/528 [00:00<00:00, 1213.07epoch/s, loss=5.07e-10, count_observations=121, weights_mean=150, weights_std=357, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 100.00% 
INFO:microcalibrate.reweight:Epoch  490: Loss = 0.000000, Change = 0.000000 (improving)
Reweighting progress:  93%|█████████▎| 490/528 [00:00<00:00, 1213.07epoch/s, loss=6.38e-10, count_observations=121, weights_mean=150, weights_std=357, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 100.00% 
INFO:microcalibrate.reweight:Epoch  500: Loss = 0.000000, Change = -0.000000 (worsening)
Reweighting progress:  93%|█████████▎| 490/528 [00:00<00:00, 1213.07epoch/s, loss=6.2e-11, count_observations=121, weights_mean=150, weights_std=357, weights_min=1] 
INFO:microcalibrate.reweight:Within 10% from targets: 100.00% 
INFO:microcalibrate.reweight:Epoch  510: Loss = 0.000000, Change = 0.000000 (improving)
Reweighting progress:  93%|█████████▎| 490/528 [00:00<00:00, 1213.07epoch/s, loss=1.57e-11, count_observations=121, weights_mean=150, weights_std=357, weights_min=1]
INFO:microcalibrate.reweight:Within 10% from targets: 100.00% 
INFO:microcalibrate.reweight:Epoch  520: Loss = 0.000000, Change = 0.000000 (improving)
Reweighting progress: 100%|██████████| 528/528 [00:00<00:00, 1211.67epoch/s, loss=1.57e-11, count_observations=121, weights_mean=150, weights_std=357, weights_min=1]

INFO:microcalibrate.reweight:Reweighting completed. Final sample size: 121
Original dataset size: 121
Calibrated dataset size: 121
Number of calibrated weights: 121
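The two warnings at the top of the log above (a target roughly three orders of magnitude from its initial estimate, and a target supported by under 1% of records) can be approximated with a check like the one below. This is a hedged sketch: `assess_targets_sketch` and its column names are hypothetical and are not the library's own assessment code, and it assumes no initial estimate is zero.

```python
import numpy as np
import pandas as pd


def assess_targets_sketch(loss_matrix, weights, targets):
    """Illustrative pre-calibration diagnostics (hypothetical helper)."""
    estimates = loss_matrix.mul(weights, axis=0).sum()
    report = pd.DataFrame(
        {"target": targets, "initial_estimate": estimates.to_numpy()},
        index=loss_matrix.columns,
    )
    # How many powers of ten separate the target from the starting estimate
    report["orders_of_magnitude"] = np.abs(
        np.log10(report["target"] / report["initial_estimate"])
    )
    # Share of records with a nonzero contribution to each target
    report["support_pct"] = (loss_matrix != 0).mean().to_numpy() * 100
    return report


toy_matrix = pd.DataFrame({"a": [10.0, 10.0, 0.0, 0.0], "b": [0.0, 0.0, 0.0, 5.0]})
report = assess_targets_sketch(toy_matrix, np.ones(4), np.array([20000.0, 5.0]))
print(report)
```

In the toy data, target `a` sits three orders of magnitude above its initial estimate of 20, and target `b` is supported by a single record out of four, which are the same two conditions flagged in the log.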
# Calculate final weighted totals
final_totals = targets_matrix.mul(calibrator.weights, axis=0).sum().values

print(f"Target totals: {targets}")
print(f"Final calibrated totals: {final_totals}")
print(f"Difference: {final_totals - targets}")
print(f"Relative error: {(final_totals - targets) / targets * 100}")
Target totals: [7.37032429e+08 9.76779350e+05 4.36479914e+04]
Final calibrated totals: [7.37025329e+08 9.76778366e+05 4.36469951e+04]
Difference: [-7.09907483e+03 -9.84300091e-01 -9.96308503e-01]
Relative error: [-0.0009632  -0.00010077 -0.0022826 ]
np.testing.assert_allclose(
    final_totals,
    targets,
    rtol=0.01,  # relative tolerance
    err_msg="Calibrated totals do not match target values",
)
performance_df.head()
   epoch      loss        target_name        target       estimate          error     abs_error  rel_abs_error
0      0  0.339715  income_aged_20_30  7.370324e+08  752617.687500  -7.362798e+08  7.362798e+08       0.998979
1      0  0.339715  income_aged_40_50  9.767794e+05  868822.000000  -1.079574e+05  1.079574e+05       0.110524
2      0  0.339715     income_aged_71  4.364799e+04   39512.902344  -4.135090e+03  4.135090e+03       0.094737
3     10  0.332754  income_aged_20_30  7.370324e+08  832004.750000  -7.362004e+08  7.362004e+08       0.998871
4     10  0.332754  income_aged_40_50  9.767794e+05  955435.812500  -2.134356e+04  2.134356e+04       0.021851
g20 = performance_df.query("target_name == 'income_aged_20_30'")
g40 = performance_df.query("target_name == 'income_aged_40_50'")

fig = make_subplots(
    rows=2, cols=2,
    subplot_titles=[
        "Estimate vs target: income_aged_20_30",
        "Estimate vs target: income_aged_40_50",
        "Relative absolute error: income_aged_20_30",
        "Relative absolute error: income_aged_40_50",
    ],
    shared_xaxes=True,
    vertical_spacing=0.12,
    horizontal_spacing=0.10,
)

fig.add_trace(
    go.Scatter(
        x=g20["epoch"], y=g20["target"],
        mode="lines", name="Target 20-30",
        line=dict(dash="dot", color="red"),
    ),
    row=1, col=1,
)
fig.add_trace(
    go.Scatter(
        x=g20["epoch"], y=g20["estimate"],
        mode="lines", name="Estimate 20-30",
        line=dict(color="blue"),
    ),
    row=1, col=1,
)

fig.add_trace(
    go.Scatter(
        x=g40["epoch"], y=g40["target"],
        mode="lines", name="Target 40-50",
        line=dict(dash="dot", color="red"),
    ),
    row=1, col=2,
)
fig.add_trace(
    go.Scatter(
        x=g40["epoch"], y=g40["estimate"],
        mode="lines", name="Estimate 40-50",
        line=dict(color="green"),
    ),
    row=1, col=2,
)

fig.add_trace(
    go.Scatter(
        x=g20["epoch"], y=g20["rel_abs_error"],
        mode="lines", showlegend=False,
        line=dict(color="blue"),
    ),
    row=2, col=1,
)

fig.add_trace(
    go.Scatter(
        x=g40["epoch"], y=g40["rel_abs_error"],
        mode="lines", showlegend=False,
        line=dict(color="green"),
    ),
    row=2, col=2,
)

fig.update_layout(
    height=800, width=1050,
    title_text="Calibration performance over epochs",
    legend=dict(x=1.05, y=1, xanchor="left", yanchor="top"),
    margin=dict(r=200),
)
fig.update_xaxes(title_text="Epoch", row=2, col=1)
fig.update_xaxes(title_text="Epoch", row=2, col=2)
fig.update_yaxes(title_text="Income ($)", row=1, col=1)
fig.update_yaxes(title_text="Income ($)", row=1, col=2)
fig.update_yaxes(title_text="Relative absolute error", row=2, col=1)
fig.update_yaxes(title_text="Relative absolute error", row=2, col=2)

fig.show()
summary = calibrator.summary()
summary
              Metric  Official target  Final estimate  Relative error
0  income_aged_20_30     7.370324e+08    7.370274e+08       -0.000007
1  income_aged_40_50     9.767794e+05    9.767784e+05       -0.000001
2     income_aged_71     4.364799e+04    4.364699e+04       -0.000023